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1.  Introduction 


Recently,  there  has  been  a  tremendous  boom  in  the  application  of  descriptive  and 
inferential  statistical  techniques  (more  popularly  known  as  “machine  learning”)  in 
a  number  of  fields.  The  medical  industry  is  using  machine  learning  (ML)  and 
healthcare  anal3^ics  to  assist  physicians  in  the  clinical  decision-making  process.^ 
Materials  scientists  are  using  ML  to  design  molecules  that  have  specific  properties 
and  function.^  From  speech  recognition,^  fraud  prevention,^  spam  email  filtering,^ 
unmanned  vehicle  operation,®  finance,^  and  even  drunk  driver  detection,*  ML  has 
now  become  an  indispensable  tool  impacting  multiple  fields  and  industries. 

Computational  chemistry  is  also  reaping  benefits  from  ML  and  it  has  been  used  to 
develop  parameters  for  semi-empirical  quantum  mechanical  (QM)  Hamiltonians^ 
and  for  interpolation  of  ab  initio  potential  energy  surfaces.^®  The  latter  application 
is  particularly  appealing  as  it  provides  a  route  toward  rapid  evaluation  of 
configurational  energies  and  forces  using  statistical  methods  that  potentially  have 
a  QM  level  of  accuracy.  Further,  since  most  ML  methods  only  rely  on  the 
underlying  training  data  for  prediction  (hyperparameters  and  assumptions  of 
statistical  distributions  aside),  they  may  not  suffer  the  same  maladies  that  plague 
conventional  functional  forms  such  as  Tersoff  bond  order  potentials  that  fail  at  high 
pressure  due  to  discontinuous  changes  in  the  bond-order  term.** 

ML  representations  of  QM  potential  energy  surfaces  have  been  developed  using  a 
variety  of  ab  initio  methods  including  density  functional  theory  (DFT)  and  coupled 
cluster  theory  for  covalently  bonded  systems.  However,  for  energetic  molecular 
crystals  of  interest  to  the  Army,  accurate  representation  of  noncovalent  interactions 
is  critical.  Gao  et  al.*^  used  ML  to  develop  van  der  Waals  corrections  for  DFT. 
McGibbon  et  al.**  developed  a  hybrid  ML  and  QM  methodology  for  computation 
of  interaction  energies  where  a  neural  network,  trained  using  coupled  cluster 
reference  data,  was  used  to  correct  Moller-Plesset  (MP2)  interaction  energies. 
Using  this  combined  approach,  they  obtained  a  6-fold  improvement  in  accuracy 
relative  to  conventional  MP2. 

Given  the  importance  of  noncovalent  interactions  in  energetic  molecular  crystals. 
Army  scientists  have  focused  on  development  and  application  of  QM  approaches 
that  can  accurately  describe  weak  electron  correlations  (“dispersion”)  between 
molecules.  One  successful  technique  uses  a  combination  of  DFT  and  symmetry- 
adapted  perturbation  theory  (SAPT)  known  as  SAPT(DFT).*'^  In  SAPT(DFT),  the 
intramonomer  correlation  is  treated  through  the  exchange-correlation  density 
functional  yielding  a  single  perturbative  series  representing  the  intermolecular 
interaction  which,  when  combined  with  resolution  of  the  identity  techniques. 
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reduces  the  computational  scaling.  Accurate  dispersion  energies,  of  particular 
importance  for  the  interlayer  interactions  in  molecular  crystals,  are  obtained  via 
application  of  the  generalized  Casimir- Polder  expression  with  frequency  dependent 
density  susceptibilities  computed  within  the  coupled  Kohn-Sham  approach.  Given 
its  combination  of  accuracy  and  efficiency,  SAPT(DFT)  has  been  used  to  develop 
intermolecular  potentials  for  the  cyclotrimethylene  trinitramine,*^  l,l-diamino-2,2- 
dinitroethylene,'®  and  l,3,5-triamino-2,4,6-trinitrobenzene*’  (TATB)  energetics 
using  conventional  functional  forms  such  as  exponential-6  (Exp-6). 

In  this  work,  the  previously  computed  SAPT(DFT)  reference  data  for  TATB  were 
used  to  develop  new  intermolecular  potentials  using  3  ML  methods: 

.  Support  vector  regression 

.  Kernel  ridge  regression^^ 

.  Neural  networks^® 

These  ML  models  differ  from  the  previous  work*’  that  used  a  parametric  function 
of  Exp-6  form  to  fit  the  SAPT(DFT)  reference  data.  The  Exp-6  potential  requires 
168  descriptors  for  each  dimer  configuration  (144  Cartesian  coordinates  and  24 
charges)  as  input.  On  the  contrary,  the  new  ML  models  use  a  reduced  descriptor  set 
and  only  require  6  input  variables:  the  center  of  mass  separation  between  monomers 
(R)  and  5  Euler  angles  defining  the  monomer  orientations.  The  ML  models  were 
applied  to  potential  energy  surface  cross  sections  of  minima  reported  in  the 
previous  work.*’  It  is  observed  that  stable  dimer  configurations  are  accurately 
described  by  the  ML  models  and  that  support  vector  regression  (SVR),  kernel  ridge 
regression  (KRR),  and  neural  networks  can  be  used  to  accurately  interpolate 
SAPT(DFT)  surfaces. 

2.  Computational  Methods 


2.1  Quantum  Mechanical  Reference  Data 

The  ML  methods  were  trained  using  a  grid  of  880  randomly  configured  TATB 
dimer  configurations  computed  previously  using  SAPT(DFT).  Full  details  of  the 
SAPT(DFT)  calculations  are  given  in  Taylor*’.  In  summary,  the  geometry  of  the 
TATB  monomer  (Fig.  I)  used  in  the  calculations  was  taken  from  the  experimental 
unit  cell  and  all  interaction  energy  calculations  used  an  aug-cc-pVDZ  basis 
supplemented  by  a  set  of  3s(a  =  0.9,0. 3,0.1)  3p(a  =  0.9,0. 3,0.1)  2d(a  =  0.6,0. 2) 
2f(a  =  0.6,0. 2)  “midbond”  functions  with  a  PBEO  density  functional  description  of 
the  monomers.  In  the  previous  work,*’  the  SAPT(DFT)  interaction  energies  were 
fit  to  an  Exp-6  functional  form  using  the  PIKAIA^*  genetic  algorithm  with  a 
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population  of  100  individuals  that  evolved  for  500  generations.  Fitness  seoring  of 
the  individuals  in  the  population  was  given  by  the  magnitude  of  the  root  mean 
square  deviation  between  the  fitted  and  referenee  interaetion  energies. 


Monomer  Unit  Cell 


Crystal 


a 


Fig.  1  Molecular  and  condensed  phase  structure  of  TATB.  Carbons  are  green,  nitrogens 
are  blue,  oxygens  are  red,  and  hydrogens  are  white. 

2.2  Machine  Learning  Methods 

Using  the  SAPT(DFT)  reference  energies,  3  ML  potentials  were  developed  using 
SVR,  KRR,  and  a  neural  network  of  radial  basis  functions.  The  SVR  and  KRR 
potentials  were  developed  using  the  sklearn^^  modules  available  in  Python  and  the 
neural  network  was  implemented  from  scratch  using  Python  code  developed  by  the 
author. 

2.2.1  Support  Vector  Regression 

SVR  is  an  extension  of  the  support  vector  classifier  and  is  used  for  prediction  of 
quantitative  instead  of  categorical  variables. In  general,  given  a  set  of  predictor 
variables  v,  one  wishes  to  optimize  the  weights  w,  and  intercept  b,  of  the  following 
function: 


y(x)  =  w0(x)  +  b  , 


(1) 
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where  0(x)  is  a  transformation  of  the  feature  space.  The  weights  and  intercept  are 
obtained  by  minimizing  the  constrained  objective  function: 

n:nn{i||w||2  +  CEi(^i  +  ^n} 

(  Yi—  w0(x)  —  b  <  £  +  (2) 

s.  t.  <  -Yi  +  w0(x)  +  b  <  £  +  , 

[ 

where  C,  £,  and  ^  collectively  control  the  maximum  allowable  error.  The  feature 
space  can  be  expanded  by  reformulating  the  above  expressions  in  terms  of  dot 
products  of  the  predictors  x.  When  written  in  terms  of  dot  products,  one  can  then 
take  advantage  of  the  “kernel  trick”,  which  allows  sampling  of  transformed  feature 
spaces  of  increased  dimension  (infinite  for  some  kernels)  without  having  to 
explicitly  sample  the  larger  space.  In  this  work,  the  radial  basis  function  kernel  was 
used: 


/f(Xi,x)  =  (3) 

and  the  parameters  C  and  y  in  Eqs.  2  and  3,  respectively,  were  determined  using 
cross-validation  (discussed  in  Section  2.2.4). 

2.2.2  Kernel  Ridge  Regression 

KRR  can  be  derived  from  Eq.  2  by  ignoring  the  bias  term  b,  setting  s  =  0,  and 
squaring  the  “slack  variables”  This  yields  the  following  expression: 

min{a||w|p  +  Y.i\yi  -  w0(x)]2}  ,  (4) 

W 

where  a  serves  as  a  regularization  parameter.  KRR  is  also  amenable  to  the  kernel 
trick,  and  the  radial  basis  function  kernel  of  Eq.  3  was  also  used  with 
hyperparameters  determined  by  cross-validation. 

2.2.3  Neural  Network 

Neural  networks  generally  consist  of  input  and  output  layers  that  are  separated  by 
hidden  layers  with  nodes  adjoined  by  weighted  connections  (Eig.  2).  The  weights 
can  be  determined  by  minimization  of  a  loss  function  (squared  error  for  example) 
using  back  propagation^^  for  analytically  differentiable  activation  functions  or  with 
stochastic  optimization  methods  such  as  genetic  algorithms. In  this  work,  a  feed¬ 
forward  network  was  developed  using  one  hidden  layer  of  radial  basis  functions, 
centered  on  the  SAPT(DET)  reference  configurations,  with  a  vector  containing  the 
center  of  mass  separation  and  Euler  angles  fed  into  the  input  layer.  When  using  a 
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single  input  node  and  one  hidden  layer,  the  weights  of  the  network  ean  be  solved 
analytically  by  inversion  of  the  Gram  matrix  K(x,  x')  that  is  computed  using  the 
radial  basis  function  kernel  in  Eq.  3.  The  exponent  y  defining  the  Gram  matrix  was 
determined  by  cross-validation  and  all  basis  functions  used  the  same  exponent. 


Fig.  2  General  schematic  of  a  neural  network  showing  input  nodes,  hidden  layer,  and 
output  node  with  weighted  connections 


2.2.4  Cross  Validation 

The  hyperparameters  for  each  model  were  determined  by  splitting  the  SAPT(DFT) 
reference  data  into  a  training  set  (766  configurations)  and  test  set  (86 
configurations)  followed  by  a  grid  search  over  hyperparameters  to  maximize  the 
coefficient  of  determination  (Q^)  for  the  test  set.  Correlation  plots  for  the  test  set 
using  each  ML  method  are  shown  in  Fig.  3  and  the  best  agreement  for  this  test  set 
was  obtained  using  KRR  with  =  0.85.  The  out-of-sample  performance  for  each 
ML  potential  is  deemed  acceptable  given  the  paucity  of  geometric  information  used 
as  input  for  each  configuration.  The  fits  could  likely  be  improved  by  using  the  full 
set  of  Cartesian  coordinates  for  each  configuration,  thereby  increasing  the 
dimension  of  the  predictor  space.  However,  given  that  SAPT(DFT)  is  a  rigid 
monomer,  intermolecular  theory,  the  internal  coordinates  would  be  somewhat 
extraneous  in  the  current  context. 
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Fig.  3  Interaction  energy  correlation  plots  for  SVR,  KRR,  and  nenral  network  potentials 
on  test  set  with  86  confignrations 

3.  Potential  Energy  Surface  Characterization 

The  ML  potentials  were  used  to  compute  potential  energy  surface  cross  sections  of 
8  stable  configurations  of  TATB  dimers  previously  identified  in  Taylor.  The 
structures  and  cross  sections  using  the  ML  potentials,  the  Exp-6  potential  from 
Taylor,*’  and  the  SAPT(DFT)  energies  are  presented  in  Figs.  4  and  5.  As  shown, 
the  topology  of  the  MF  potential  energy  surfaces  are  in  good  agreement  with  the 
ab  initio  data  and  stable  minima  on  the  ab  initio  surface  are  also  present  on  the  MF 
surfaces.  Clearly  one  cannot  locate  all  minima  on  the  surface  nor  can  it  be 
guaranteed  that  all  minima  on  the  fitted  surfaces  correspond  to  minima  on  the  ab 
initio  surface.  However,  for  the  configurations  presented  in  Taylor,*’  the  ab  initio 
potential  energy  surface  is  well  characterized  by  the  MF  potentials. 
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Energy  (kcal/mol)  Energy  (kcal/mol) 


Fig.  4  Potential  energy  surface  cross  sections  for  nitro  and  amine  substituent  interactions 
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Fig.  5  Potential  energy  surface  cross  sections  for  ring  interactions 

4.  Conclusion 


Although  the  ML  potentials  perform  well  for  the  stationary  points  presented  in  this 
work,  as  with  any  fitted  model,  there  will  be  regimes  where  the  accuracy  is  less 
than  optimal.  As  an  example,  during  testing  it  was  observed  that  at  a  large 
intermonomer  separation  (R>12  angstrom)  for  some  sample  configurations,  the 
SVR  potential  had  a  non-zero  (but  still  small)  interaction  energy  of  approximately 
-0. 1  kcal/mol.  This  is  to  be  compared  to  the  KRR  and  RBF  potentials  that  predicted 
energies  of  magnitude  less  than  10'^°  kcal/mol,  as  one  would  generally  expect  at 
large  separation.  This  is  likely  due  to  a  diffuse  exponent  used  in  the  radial  basis 
function  kernel  of  the  SVR  potential  that  results  in  non-negligible  contributions  to 
the  interaction  energy,  even  at  large  separation.  In  practice,  this  could  be  remedied 
by  including  more  asymptotic  points  in  the  training  set  to  obtain  a  better  exponent 
for  the  kernel  or  by  simply  using  a  cutoff  for  the  potential. 

It  is  possible  to  re-fit  the  ML  potentials  using  Cartesian  coordinates  for  the  dimers 
instead  of  the  reduced  set  of  descriptors  used  in  this  work.  Cartesian  potentials,  and 
the  associated  forces,  could  then  be  used  to  perform  molecular  dynamics 
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simulations  to  determine  if  the  ML  potentials  can  accurately  reproduce  the 
condensed  phase  crystal  structure.  This  work  is  underway  and  will  be  the  subject 
of  a  future  report. 
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List  of  Symbols,  Abbreviations,  and  Acronyms 


DFT 

density  functional  theory 

Exp-6 

exponential- 6 

KRR 

kernel  ridge  regression 

ML 

machine  learning 

MP2 

Moller-Plesset 

QM 

quantum  mechanical 

SAPT 

symmetry-adapted  perturbation  theory 

SVR 

support  vector  regression 

TATB 

1, 3, 5-triamino-2, 4, 6-trinitrobenzene 
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